AITopics | gradient penalty

Collaborating Authors

gradient penalty

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Neglected Hessian component explains mysteries in sharpness regularization

Neural Information Processing SystemsMar-22-2026, 20:10:27 GMT

Recent work has shown that methods that regularize second order information like SAM can improve generalization in deep learning. Seemingly similar methods like weight noise and gradient penalties often fail to provide such benefits. We investigate this inconsistency and reveal its connection to the the structure of the Hessian of the loss. Specifically, its decomposition into the positive semi-definite Gauss-Newton matrix and an indefinite matrix, which we call the Nonlinear Modeling Error (NME) matrix. Previous studies have largely overlooked the significance of the NME in their analysis for various reasons. However, we provide empirical and theoretical evidence that the NME is important to the performance of gradient penalties and explains their sensitivity to activation functions. We also provide evidence that the difference in regularization performance between gradient penalties and weight noise can be explained by the NME. Our findings emphasize the necessity of considering the NME in both experimental design and theoretical analysis for sharpness regularization.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Genre: Research Report (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Add feedback

Neglected Hessian component explains mysteries in sharpness regularization

Neural Information Processing SystemsFeb-18-2026, 15:13:05 GMT

SAM can improve generalization in deep learning. Seemingly similar methods like weight noise and gradient penalties often fail to provide such benefits. We investigate this inconsistency and reveal its connection to the the structure of the Hessian of the loss.

artificial intelligence, information, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe > Austria (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Quality Aware Generative Adversarial Networks

KANCHARLA PARIMALA, Sumohana Channappayya

Neural Information Processing SystemsFeb-13-2026, 18:16:55 GMT

Alargenumber ofveryinteresting anduseful applications have emerged due to the ability of GANs to learn complex real-world distributions.

artificial intelligence, machine learning, regularizer, (16 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

How to Train Your Latent Control Barrier Function: Smooth Safety Filtering Under Hard-to-Model Constraints

Nakamura, Kensuke, Bishop, Arun L., Man, Steven, Johnson, Aaron M., Manchester, Zachary, Bajcsy, Andrea

arXiv.org Artificial IntelligenceNov-25-2025

Latent safety filters extend Hamilton-Jacobi (HJ) reachability to operate on latent state representations and dynamics learned directly from high-dimensional observations, enabling safe visuo-motor control under hard-to-model constraints. However, existing methods implement "least-restrictive" filtering that discretely switch between nominal and safety policies, potentially undermining the task performance that makes modern visuomotor policies valuable. While reach-ability value functions can, in principle, be adapted to be control barrier functions (CBFs) for smooth optimization-based filtering, we theoretically and empirically show that current latent-space learning methods produce fundamentally incompatible value functions. We identify two sources of incompatibility: First, in HJ reachability, failures are encoded via a "margin function" in latent space, whose sign indicates whether or not a latent is in the constraint set. However, representing the margin function as a classifier yields saturated value functions that exhibit discontinuous jumps. We prove that the value function's Lipschitz constant scales linearly with the margin function's Lipschitz constant, revealing that smooth CBFs require smooth margins. Second, reinforcement learning (RL) approximations trained solely on safety policy data yield inaccurate value estimates for nominal policy actions, precisely where CBF filtering needs them. We propose the LatentCBF, which addresses both challenges through gradient penalties that lead to smooth margin functions without additional labeling, and a value-training procedure that mixes data from both nominal and safety policy distributions. Experiments on simulated benchmarks and hardware with a vision-based manipulation policy demonstrate that LatentCBF enables smooth safety filtering while doubling the task-completion rate over prior switching methods.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2511.18606

Country: North America > United States (0.46)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Improved Training of Wasserstein GANs

Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron C. Courville

Neural Information Processing SystemsNov-21-2025, 11:38:21 GMT

Neural Information Processing Systems http://nips.cc/

architecture, arxiv preprint arxiv, generator, (12 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.14)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Add feedback

Neglected Hessian component explains mysteries in sharpness regularization

Neural Information Processing SystemsOct-10-2025, 20:50:35 GMT

derivative, information, penalty, (15 more...)

Neural Information Processing Systems

Country: Europe > Austria (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

NS-Pep: De novo Peptide Design with Non-Standard Amino Acids

Guo, Tao, Yin, Junbo, Wang, Yu, Gao, Xin

arXiv.org Artificial IntelligenceOct-7-2025

Peptide drugs incorporating non-standard amino acids (NSAAs) offer improved binding affinity and improved pharmacological properties. However, existing peptide design methods are limited to standard amino acids, leaving NSAA-aware design largely unexplored. We introduce NS-Pep, a unified framework for co-designing peptide sequences and structures with NSAAs. The main challenge is that NSAAs are extremely underrepresented-even the most frequent one, SEP, accounts for less than 0.4% of residues-resulting in a severe long-tailed distribution. To improve generalization to rare amino acids, we propose Residue Frequency-Guided Modification (RFGM), which mitigates over-penalization through frequency-aware logit calibration, supported by both theoretical and empirical analysis. Furthermore, we identify that insufficient side-chain modeling limits geometric representation of NSAAs. To address this, we introduce Progressive Side-chain Perception (PSP) for coarse-to-fine torsion and location prediction, and Interaction-Aware Weighting (IAW) to emphasize pocket-proximal residues. Moreover, NS-Pep generalizes naturally to the peptide folding task with NSAAs, addressing a major limitation of current tools. Experiments show that NS-Pep improves sequence recovery rate and binding affinity by 6.23% and 5.12%, respectively, and outperforms AlphaFold3 by 17.76% in peptide folding success rate.

artificial intelligence, machine learning, peptide, (19 more...)

arXiv.org Artificial Intelligence

2510.03326

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education > Health & Safety > School Nutrition (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Sobolev Independence Criterion

Youssef Mroueh, Tom Sercu, Mattia Rigotti, Inkit Padhi, Cicero Nogueira dos Santos

Neural Information Processing SystemsOct-3-2025, 01:31:52 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, sic, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Bounds all around: training energy-based models with bidirectional bounds

Neural Information Processing SystemsAug-16-2025, 14:11:48 GMT

Recent work has established links to generative adversarial networks, where the EBM is trained through a minimax game with a variational value function.

artificial intelligence, international conference, machine learning, (14 more...)

Neural Information Processing Systems

Country: